Abstract

We define an algorithm to be the set of programs that implement or 
express that algorithm. The set of all programs is partitioned into equiv- 
alence classes. Two programs are equivalent if they are essentially the 
same program. The set of equivalence classes forms the category of algo- 
rithms. Although the set of programs does not even form a category, the 
set of algorithms forms a category with extra structure. The conditions we
give that describe when two programs are essentially the same turn out to 
be coherence relations that enrich the category of algorithms with extra 
structure. Universal properties of the category of algorithms are proved. 



Keywords: Formal algorithms, equivalence of programs, operads, Grzegorczyk's
hierarchy.



1 Introduction 

In their excellent text Introduction to Algorithms, Second Edition [5], Cormen,
Leiserson, Rivest, and Stein begin Section 1.1 with a definition of an algorithm:

Informally, an algorithm is any well-defined computational proce- 
dure that takes some value, or set of values, as input and produces 
some value, or set of values, as output. 

Three questions spring forward: 

1. "Informally"? Can such a comprehensive and highly technical book of 
1180 pages not have a "formal" definition of an algorithm? 

2. What is meant by "well-defined?" 

3. The term "procedure" is as vague as the term "algorithm." What is a 
"procedure?" 

Knuth [131 m] has been a little more precise in specifying the requirements
demanded for an algorithm. But he writes "Of course if I am pinned down and 
asked to explain more precisely what I mean by these remarks, I am forced to 
admit that I don't know any way to define any particular algorithm except in a 
programming language." ([H], page 1.) 

Although algorithms are hard to define, they are nevertheless real mathemat- 
ical objects. We name and talk about algorithms with phrases like "Mergesort 
runs in n lg n time". We quantify over all algorithms, e.g., "There does not exist
an algorithm to solve the halting problem." They are as "real" as the number 
e or the set Z. See [TU] for an excellent philosophical overview of the subject. 

Many researchers have given definitions over the years. (Refer to [S] for a 
historical survey of some of these definitions. One must also read the important 
works of Yiannis Moschovakis, e.g., [2Q].) Many of the given definitions are of 
the form "An algorithm is a program in this language/system/machine." This 
does not really conform to the current usage of the word "algorithm." Rather, 
this is more in tune with the modern usage of the word "program." They all 
have a feel of being a specific implementation of an algorithm on a specific 
system. Imagine a professor teaching a certain algorithm to a class and then 
assigning the class to go home and program the algorithm. In any class with the 
moral abhorrence of cheating, the students will return many different programs 
implementing the same algorithm. We would not call each of these different 
programs an algorithm. Rather the different programs are implementations of 
a single algorithm. And yet some researchers do call each of those programs a
different algorithm, e.g. [6]. We would like to propose another definition. 

Consider Figure 1. 

At the bottom of the figure is the set of all functions. Two functions are 
highlighted: the sort function and the function that outputs the maximum of 
its inputs. On top of the figure is the set of all programs. For every function 
there is a set of programs that implement that function. We have highlighted 
four programs that implement the sort function: mergesort_a and mergesort_b
are two different programs that implement the algorithm mergesort. Similarly
quicksort_x and quicksort_y are two different implementations of the algorithm
quicksort. There are also many different programs that implement the max
function. mergesort_a and mergesort_b are grouped in one subset of all the
programs that implement the sort function. This subset will correspond to
the mergesort algorithm. Similarly, quicksort_x and quicksort_y are grouped
together and will correspond to the quicksort algorithm. There are similar 
groupings for a binary search algorithm that finds the max of a list of elements. 
There are also other algorithms that find the max. This intuition propels us to 
define an algorithm as the set of all programs that implement the algorithm. 

We define an algorithm analogously to the way that Gottlob Frege defined 
a natural number. Basically Frege says that the number 42 is the equivalence 
class of all sets of size 42. He looks at the conglomerate of all finite sets and 
makes an equivalence relation. Two finite sets are equivalent if there is a one- 
to-one onto function from one set to the other. The set of all equivalence classes 
under this equivalence relation forms the set of natural numbers. For us, an
algorithm is an equivalence class of programs. Two programs are part of the
same equivalence class if they are "essentially" the same. Each program is an
expression (or an implementation) of the algorithm, just as every set of size 42
is an expression of the number 42.

Figure 1: Programs, Algorithms and Functions.

For us, an algorithm is the sum total of all the programs that express it. In 
other words, we look at all computer programs and partition them into different 
subsets. Two programs in the same subset will be two implementations of the 
same algorithm. These two programs are "essentially" the same. 

What does it mean for two programs to be "essentially" the same? Some 
examples are in order: 

• One program might perform Process1 first and then perform an unrelated
Process2 after. The other program will perform the two unrelated
processes in the opposite order.

• One program might perform a certain process in a loop n times and the
other program will unwind the loop and perform it n - 1 times and then
perform the process again outside the loop.

• One program might perform two unrelated processes in one loop, and the
other program might perform each of these two processes in its own loop.

In all these examples, the two programs are definitely performing the same 
function, and everyone would agree that both programs are implementations of 
the same algorithm. We are taking that subset of programs to be the definition
of an algorithm. 
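
The three examples above can be sketched as pairs of small programs. This is only an illustration: the function names and the particular processes (sum, max, squaring, doubling) are our own hypothetical choices, not part of the formal development that follows.

```python
# Example 1: two unrelated processes performed in either order.
def reorder_a(xs):
    total = sum(xs)          # Process1
    largest = max(xs)        # Process2, unrelated to Process1
    return total, largest

def reorder_b(xs):
    largest = max(xs)        # Process2 first
    total = sum(xs)          # then Process1
    return total, largest

# Example 2: a loop run n times versus the same loop unwound once.
def loop_a(x, n):
    for _ in range(n):       # body runs n times inside the loop
        x = 2 * x + 1
    return x

def loop_b(x, n):
    for _ in range(n - 1):   # body runs n - 1 times inside the loop
        x = 2 * x + 1
    if n >= 1:
        x = 2 * x + 1        # and once more outside the loop
    return x

# Example 3: two unrelated processes in one loop versus one loop each.
def fused(xs):
    squares, doubles = [], []
    for v in xs:
        squares.append(v * v)
        doubles.append(2 * v)
    return squares, doubles

def split(xs):
    squares = [v * v for v in xs]
    doubles = [2 * v for v in xs]
    return squares, doubles
```

In each pair the two programs differ as texts but compute the same function in what most readers would call the same way.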

Many relations that say when two programs are essentially the same will 
be given. However, it is doubtful that we have the final word on this. Hence 
the word "Towards" in the title. Whether or not two programs are essentially 
the same, or whether or not a program is an implementation of a particular 
algorithm is really a subjective decision. Different relations can be given for 
different purposes. We give relations under which most people would agree that
the two related programs are essentially the same, but we are well aware that
others can come along and give more, fewer, or different relations. The important
realization is that the relations that we feel are the most obvious turn out to
be relations that correspond to standard categorical coherence rules. When we
mod out by any set of relations, we get more structure. When we mod out by
these relations, our set of programs becomes a category with more structure.
Our goal is not to give the final word on the topic, but to point out that this is
a valid definition of an algorithm and that the set of equivalence classes
(the algorithms) has more structure than the set of programs.

We consider the set of all programs which we might call Programs. An 
equivalence relation $\approx$ of "essentially the sameness" is then defined on this set.
The set of equivalence classes Programs/$\approx$ shall then be called Algorithms.
There is a nice onto function $\phi$ : Programs $\to$ Algorithms that takes
every program $P$ to the equivalence class $\phi(P) = [P]$. One might think of any
function $\psi$ : Algorithms $\to$ Programs such that $\phi \circ \psi = Id_{Algorithms}$ as an
"implementer": $\psi$ takes an algorithm to an implementation of that algorithm.

To continue with this line of reasoning, there are many different algorithms
that perform the same function. For example, Kruskal's algorithm and Prim's
algorithm are two different ways of finding a minimum spanning tree of a
weighted graph. Quicksort and Mergesort are two different algorithms to sort
a list. There exists an equivalence relation on the set of all algorithms. Two
algorithms are equivalent if they perform the same function. We obtain
Algorithms/$\approx'$, which we might call Comp. Functions or computable
functions. It is an undecidable problem to determine when two programs perform the
same computable function. Hence we might not be able to effectively give the
relation $\approx'$; nevertheless it exists. Even if we were able to give the relation, that
would not mean that the word problem (i.e., telling when two different
equivalence classes of descriptions are equivalent) is solvable. Nevertheless, there is
an onto function $\phi'$ : Algorithms $\to$ Comp. Functions.

We summarize our intentions with the following picture. 

Programming                Computer Science                Mathematics

Programs $\overset{\phi}{\twoheadrightarrow}$ Algorithms $\overset{\phi'}{\twoheadrightarrow}$ Comp. Functions

Programs are what programmers, or software engineers deal with. Algorithms 
are the domain of computer scientists. Computable functions are of interest to 
pure mathematicians.

With this picture in mind, we can explain other equivalence relations describing
program "sameness". One can give many different equivalence relations but
they must fall within the two extremes. One extreme says that no two programs 
are really the same, i.e., every program is essentially an algorithm. In that case 
Programs = Algorithms. This extreme case is taken up by [B]. In contrast, 
another extreme is to say that two programs are the same if they perform the 
same operation or are bisimilar. In that case Algorithms = Comp. Func- 
tions. In this paper we choose a middle way. Others can have other equivalence 
relations but they must fall in the middle. There are finer and coarser equivalence
relations than ours. There will also be unrelated equivalence relations. For
every equivalence relation, the set of algorithms will have a particular structure. 

In our scheme, Programs will form a directed graph with a composition of
arrows and a distinguished loop on every vertex. However they will not have
the structure of a true category: the composition will not be associative and
the distinguished loops will not act like the identity. In contrast, Algorithms
will be a real category with extra structure: a Cartesian product structure 
and a weak parameterized natural number object (a categorical way of saying 
that the category is closed under recursion). This category will turn out to be 
an initial category in the 2-category of all categories with products and weak 
parameterized natural number objects. 

Others have studied similar categories before. Joyal in an unpublished 
manuscript about "arithmetical universes", (see [T7] for a history) as well as 
[7], [22] and [21] have looked at the free category with products and a strong 
natural number object. Marie-France Thibault [23] has looked at a Cartesian 
closed category with a weak natural number object. They characterized what 
type of functions can be represented in such categories. Although related cat- 
egories have been studied, the connection with the notion of an algorithm has 
never been seen. Nor has this category ever been constructed as a quotient of a 
syntactical graph. 

We are not trying to make any ontological statement about the existence of 
algorithms. We are merely giving a mathematical way of describing how one 
might think of an algorithm. Human beings dealt with rational numbers for 
millennia before mathematicians decided that rational numbers are equivalence 
classes of pairs of integers: 

$\mathbb{Q} = \{(m,n) \in \mathbb{Z} \times \mathbb{Z} \mid n \neq 0\} / \approx$

where

$(m,n) \approx (m',n')$ iff $mn' = nm'$.

Similarly, one can think of the existence of algorithms in any way that one 
chooses. We are simply offering a mathematical way of presenting them. 

There is an interesting analogy between thinking of a rational number as an
equivalence class of pairs of integers and our definition of an algorithm as an 
equivalence class of programs. Just as a rational number can only be expressed 
by an element of the equivalence class, so too, an algorithm can only be expressed 
by presenting an element of the equivalence class. When we write an algorithm,
we are really writing a program. This explains the quote from Knuth given at
the beginning of this paper. Pseudo-code is used to allow for ambiguities and
not show any preference for a language. But it is, nevertheless, a program. 

Another applicable analogy is that, just as a rational number by itself has no
structure (it is simply an equivalence class of pairs of integers), so too, an 
algorithm has no structure. In contrast, the set of rational numbers has much 
structure. So too, the set (category) of algorithms has much structure. Q is 
the smallest field that contains the natural numbers. We shall see in Section 4 
that the category of algorithms is an initial category with a product and a weak 
natural number object. 

When people talk about a rational number, they prefer to use the
pair (3,5) = 3/5 as opposed to the equivalent pair (6,10), or the equivalent
(3000,5000). One might say that the pair (3,5) is a "canonical
representation" of the equivalence class to which it belongs. It would be nice if
there were a "canonical representation" of an algorithm. We speculate further
on this idea in the last section of this paper.
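
The rational-number side of this analogy is easy to make concrete. The following is a minimal sketch, assuming pairs (m, n) with n nonzero; the names `equivalent` and `canonical` are ours. The map `canonical` picks one representative from each equivalence class, which is the role an "implementer" would play for algorithms.

```python
from math import gcd

def equivalent(p, q):
    """(m, n) ~ (m', n') iff m * n' == n * m'."""
    (m, n), (m2, n2) = p, q
    return m * n2 == n * m2

def canonical(p):
    """Return the representative of p's class in lowest terms, positive denominator."""
    m, n = p
    d = gcd(m, n)            # n != 0, so d >= 1
    if n < 0:
        d = -d               # dividing by -d makes the denominator positive
    return (m // d, n // d)
```

Two pairs are equivalent exactly when they have the same canonical representative, and applying `canonical` twice changes nothing. No comparably simple choice of representative is known to us for programs.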

The question arises as to which programming language should we use? 
Rather than choosing one programming language to the exclusion of others, we 
look at a language of descriptions of primitive recursive functions. We choose 
this language because of its beauty, its simplicity of presentation, and the fact 
that most readers can easily become familiar with this language. The language 
of descriptions of primitive recursive functions basically has three operations: 
Composition, Bracket, and Recursion. A primitive recursive function can be de- 
scribed in many different ways. A description of a primitive recursive function is 
basically the same thing as a program in that it tells how to calculate a function. 
There is a basic correlation between programming concepts and the operations 
in generating descriptions of primitive recursive functions: recursion is like a 
loop, composition is sequential processing, and bracket is parallel processing. 
We are well aware that we are limiting ourselves because the set of primitive 
recursive functions is a proper subset of the set of all computable functions. By 
limiting ourselves, we are going to get a proper subset of all algorithms. Even 
though we are, for the present time, restricting ourselves, we feel that the results 
we will get are interesting in their own right. There is an ongoing project to 
extend this work to all recursive functions [19].



There is another way to view this entire endeavor. What we are creating here 
is an operad. Operads are a universal algebraic/categorical way of describing 
extra algebraic structure. Recently operads have become very popular with 
algebraic topologists and people who study quantum field theories. We are 
creating an operad that describes some of the extra structure that exists on 
the set of total functions of a certain type. With such total functions one can 
compose, do recursion, and take the product of those functions. We can then
look at the algebra of this operad generated by all total functions from powers
of N to N. One then can examine the subalgebra generated by basic or initial
functions (this essentially is our PRdesc). We go further and look at a quotient
of this subalgebra by using more relations (this essentially is our PRalg). We 
show in section 4 of this paper that this quotient subalgebra is an initial object 
in a certain 2-category. This operadic viewpoint is further elaborated and used 
in [19] where we tackle the harder problem of all recursive functions.



There is a fascinating correspondence between this work and similar work in 
low-dimensional topology and related work in topological quantum field theory 
(TQFT). This correspondence is in the spirit of [2] and [T] where they show 
that using the powerful language of category theory there are many similar 
phenomena in low-dimensional topology, quantum physics, and logic. In order 
for us to express this correspondence, we are going to have to assume some 
knowledge of the basic yoga of low-dimensional topology. If this is not known, 
then simply skip this paragraph. For clarity's sake, we shall concentrate on the 
category of braids. However, we could have described similar correspondences 
with tangles, ribbons, cobordisms, etc. Similar to our three levels of structure

Programs $\twoheadrightarrow$ Algorithms $\twoheadrightarrow$ Comp. Functions

there are three levels of objects in low-dimensional topology:

Braid Projections $\twoheadrightarrow$ Braid Groups $\twoheadrightarrow$ Symmetric Groups.

With these, there are the following analogies. 

• Just as we can only represent an algorithm by giving a program, so too, 
the only way to represent a braid is by giving a braid projection. 

• Just as our set of Programs does not have enough structure to form a 
category, so too, the set of Braid Projections does not have a worthwhile 
structure. One can compose braid projections sequentially and in parallel.
But there is no associativity. There are identity braids, but when se- 
quentially composed with other braid projections, they do not act like 
projections. There are inverse braid projections, but when sequentially 
composed with the original projection, there is no identity projection. 

• Just as we can get the category of algorithms by looking at equivalence 
classes of programs, so too, we can get braids by looking at equivalence 
classes of braid projections. With braid projections we look at Reidemeister
moves to determine when two braid projections are really the same.
Here we look at relations stated in this paper to tell when two programs 
are the same. 

• Just as we are not giving the final word about what relations to use, so too, 
there is no final word about which Reidemeister moves to use. Depending
on your choice, you will get braids, ribbons, oriented ribbons etc. 

• Just as our category of Algorithms is the free category with products and
a weak natural number object generated by the empty category, so too,
the category of Braids is the free braided monoidal category generated
by one object.

• Just as we can go down to the level of functions by making two algorithms
that perform the same function equivalent, so too, we can add a relation
that two strings can cross each other and get the Symmetric Groups.

• Just as the main focus of computer scientists is algorithms and not
programs, so too, the main focus of topologists is braids and not braid
diagrams.

There is obviously much more to explore with these analogies. There also should 
be a closer relationship between these fields. After all, some of our relations are 
very similar to Reidemeister moves.



Section 2 will review the basics of primitive recursive functions and show how 
they may be described by special labeled binary trees. Section 3 will then give 
many of the relations that tell when two descriptions of a primitive recursive 
function are "essentially" the same. Section 4 will discuss the structure of 
the set of all algorithms. We shall give a universal categorical description of 
the category of algorithms. This is the only Section that uses category theory 
in a non-trivial way. Section 5 will discuss complexity results and show how 
complexity theory fits into our framework. We conclude this paper with a list 
of possible ways this work can progress. 

At this point it is appropriate to say what this paper is not. 

• We have no ambition to say anything new about primitive recursive func- 
tions. We are only using descriptions of primitive recursive functions as 
a simple programming language with three operations. Nor are we say- 
ing anything about a relationship between programming languages and 
primitive recursive functions. 

• Nothing new will be said about category theory. Rather, we are making 
a link of these categories and the concept of an algorithm. 

• We will not say anything new about program semantics. Our equivalence
relations are between descriptions that correspond to the same function. 

Rather, what we are doing here is giving a novel definition of an algorithm 
and showing that the set of all algorithms has more manageable structure
than the set of all programs. We are also showing that categorical coherence 
relations correspond to rules saying when two programs are essentially the same. 

Yuri Manin has incorporated an earlier draft [21] of this paper into the second
edition of his A Course in Mathematical Logic [18]. Within Chapter IX of
that book he describes the constructions given in this paper using the language
of PROPs and operads that are of interest to mathematicians and theoretical
physicists. This earlier draft [21] was also discussed in [H].

2 Descriptions of Primitive Recursive Functions

Rather than talking of computer programs, per se, we shall talk of descriptions 
of primitive recursive functions. For every primitive recursive function, there 
are many different methods of "building-up", or constructing the function from
the basic functions. Each method is similar to a program. 

We remind the reader that primitive recursive functions $\mathbb{N}^n \to \mathbb{N}$ are the "basic"
or "initial" functions:

• null function $n : \mathbb{N} \to \mathbb{N}$ where $n(x) = 0$

• successor function $s : \mathbb{N} \to \mathbb{N}$ where $s(x) = x + 1$

• for each $k \in \mathbb{N}$ and for each $i \le k$, a projection function $\pi^k_i : \mathbb{N}^k \to \mathbb{N}$
where $\pi^k_i(x_1, x_2, \ldots, x_k) = x_i$

and functions constructed from basic functions through a finite number of
compositions and recursions.

We shall extend this definition in two non-essential ways. An $n$-tuple of
primitive recursive functions $(f_1, f_2, \ldots, f_n) : \mathbb{N}^m \to \mathbb{N}^n$ shall also be called a
primitive recursive function. Also, a constant function $k : * \to \mathbb{N}$ is called a
primitive recursive function because for every $k \in \mathbb{N}$, the constant map may be
written as $s \circ s \circ \cdots \circ s \circ n$.
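
As a minimal sketch, the basic functions and the constant maps can be written as follows. The Python names (`null`, `succ`, `proj`, `compose`, `constant`) are our own, and `constant(k)` is modelled here as a map that ignores its input, since it is built by composing with the null function.

```python
def null(x):
    """n(x) = 0"""
    return 0

def succ(x):
    """s(x) = x + 1"""
    return x + 1

def proj(k, i):
    """pi^k_i: take k inputs and return the i-th one (1 <= i <= k)."""
    def pi(*xs):
        assert len(xs) == k
        return xs[i - 1]
    return pi

def compose(*fs):
    """compose(f, g, h)(x) = f(g(h(x))): the rightmost map is applied first."""
    def composed(x):
        for f in reversed(fs):
            x = f(x)
        return x
    return composed

def constant(k):
    """s o s o ... o s o n with k copies of s."""
    return compose(*([succ] * k + [null]))
```

For example, `constant(3)` applies the null function and then the successor three times, so it returns 3 on any input.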

Let us spend a few minutes reminding ourselves of basic facts about recursion.
The simplest form of recursion is for a given integer $k$ and a function
$g : \mathbb{N} \to \mathbb{N}$. From this one constructs $h : \mathbb{N} \to \mathbb{N}$ as follows

$h(0) = k$
$h(n + 1) = g(h(n))$.

A more complicated form of recursion, and the one we shall employ, is
for a given function $f : \mathbb{N}^k \to \mathbb{N}^m$ and a given function $g : \mathbb{N}^k \times \mathbb{N}^m \to \mathbb{N}^m$.
From this one constructs $h : \mathbb{N}^k \times \mathbb{N} \to \mathbb{N}^m$ as

$h(x, 0) = f(x)$
$h(x, n + 1) = g(x, h(x, n))$

where $x \in \mathbb{N}^k$ and $n \in \mathbb{N}$.
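
The second form of recursion can be sketched as a higher-order function. Here `rec` is our own name, and for simplicity $x$ is passed as a single Python value (it may itself be a tuple); the loop simply unfolds the two defining equations.

```python
def rec(f, g):
    """Return h with h(x, 0) = f(x) and h(x, n + 1) = g(x, h(x, n))."""
    def h(x, n):
        acc = f(x)               # h(x, 0)
        for _ in range(n):
            acc = g(x, acc)      # h(x, i + 1) from h(x, i)
        return acc
    return h

# Addition and multiplication described by recursion.
add = rec(lambda x: x, lambda x, y: y + 1)       # add(x, n) = x + n
mul = rec(lambda x: 0, lambda x, y: add(y, x))   # mul(x, n) = x * n
```

Note that `mul` is described in terms of `add`, which is itself described by recursion on the successor: a description is built up in layers, exactly like a program.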

The most general form of recursion, and the definition usually given for
primitive recursive functions, is for a given function $f : \mathbb{N}^k \to \mathbb{N}^m$ and a given
function $g : \mathbb{N}^k \times \mathbb{N}^m \times \mathbb{N} \to \mathbb{N}^m$. From this, one constructs $h : \mathbb{N}^k \times \mathbb{N} \to \mathbb{N}^m$

$h(x, 0) = f(x)$
$h(x, n + 1) = g(x, h(x, n), n)$

where $x \in \mathbb{N}^k$ and $n \in \mathbb{N}$.

We shall use the middle definition of recursion because the extra input
variable in $g$ does not add anything [11]. It simply makes things unnecessarily
complicated. However, we are certain that any proposition that can be said
about the second type of recursion can also be said for the third type. See [3]
Section 7.5, and [4] Section 5.5.
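
One standard way to see why the extra input adds nothing is to carry the counter along as part of the recursion state. The sketch below is our own illustration of that trick (we do not claim it is the argument of the cited references); the names `rec`, `rec_general` and `rec_general_via_middle` are hypothetical.

```python
def rec(f, g):
    """Middle form: h(x, 0) = f(x), h(x, n + 1) = g(x, h(x, n))."""
    def h(x, n):
        acc = f(x)
        for _ in range(n):
            acc = g(x, acc)
        return acc
    return h

def rec_general(f, g):
    """General form: h(x, 0) = f(x), h(x, n + 1) = g(x, h(x, n), n)."""
    def h(x, n):
        acc = f(x)
        for i in range(n):
            acc = g(x, acc, i)
        return acc
    return h

def rec_general_via_middle(f, g):
    """Simulate the general form with the middle form on pairs (value, counter)."""
    step = lambda x, state: (g(x, state[0], state[1]), state[1] + 1)
    paired = rec(lambda x: (f(x), 0), step)
    return lambda x, n: paired(x, n)[0]   # project away the counter

# Factorial needs the counter: h(x, n + 1) = h(x, n) * (n + 1).
f = lambda x: 1
g = lambda x, y, n: y * (n + 1)
fact_direct = rec_general(f, g)
fact_sim = rec_general_via_middle(f, g)
```

The simulation pairs the running value with the counter, which is why the tuples of functions introduced above are convenient.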

Although primitive recursive functions are usually described as closed only 
under composition and recursion, there is, in fact, another implicit operation 
for which the functions are closed: bracket. Given primitive recursive functions
$f : \mathbb{N}^k \to \mathbb{N}$ and $g : \mathbb{N}^k \to \mathbb{N}$, there is a primitive recursive function
$h = \langle f, g \rangle : \mathbb{N}^k \to \mathbb{N} \times \mathbb{N}$. $h$ is defined as

$h(x) = (f(x), g(x))$

for any $x \in \mathbb{N}^k$. We shall see that having this bracket operation is almost the
same as having a product operation.

In order to save the eyesight of our poor reader, rather than writing too 
many exponents, we shall write a power of the set N for some fixed but arbitrary 
number as A, B, C etc. With this notation, we may write the recursion operation
as follows: from functions $f : A \to B$ and $g : A \times B \to B$ one constructs
$h : A \times \mathbb{N} \to B$.

If $f$ and $g$ are functions with the appropriate source and targets, then we
shall write their composition as $h = f \circ g$. If they have the appropriate source
and target for the bracket operation, we shall write the bracket operation as
$h = \langle f, g \rangle$. We are in need of a similar notation for recursion. So if there are
$f : A \to B$ and $g : A \times B \to B$ we shall write the function that one obtains from
them through recursion as $h = f \sharp g : A \times \mathbb{N} \to B$.

We are going to form a directed graph that contains all the descriptions of 
primitive recursive functions. We shall call this graph PRdesc. The vertices 
of the graph shall be powers of the natural numbers $\mathbb{N}^0 = *, \mathbb{N}, \mathbb{N}^2, \mathbb{N}^3, \ldots$. The
edges of the graph shall be descriptions of primitive recursive functions. One 
should keep in mind the following intuitive picture. 

2.1 Trees

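Each edge in PRdesc shall be a labeled binary tree whose leaves are basic
functions and whose internal nodes are labeled by C, R or B for composition,
recursion and bracket. Every internal node of the tree shall be derived
from its left child and its right child. We shall use the following notation:

• Composition: the node $g \circ f : A \to C$, labeled C, has left child $f : A \to B$
and right child $g : B \to C$.

• Recursion: the node $h = f \sharp g : A \times \mathbb{N} \to B$, labeled R, has left child
$f : A \to B$ and right child $g : A \times B \to B$.

• Bracket: the node $\langle f, g \rangle : A \to B \times C$, labeled B, has left child
$f : A \to B$ and right child $g : A \to C$.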

PRdesc has more structure than a simple graph. There is a composition
of edges. Given a tree $f : A \to B$ and a tree $g : B \to C$, there is another
tree $g \circ f : A \to C$. It is, however, important to realize that PRdesc is not
a category. For three composable edges, the trees $h \circ (g \circ f)$ and $(h \circ g) \circ f$
exist and they perform the same operation, but they are, nevertheless, different
programs and different trees. There is a composition of morphisms, but this
composition is not associative.

Furthermore, for each object A of the graph, there is a distinguished morphism
$\pi^A_A : A \to A$ which does not act like an identity. It is simply a function
whose output is the same as its input.
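
The distinction between a tree and the function it computes can be sketched with a small data type and an evaluator. This is an illustration under our own assumptions: the class names `Leaf` and `Node`, the evaluator `run`, and the convention that every value is a tuple of naturals are all hypothetical choices, and no type checking is performed.

```python
from dataclasses import dataclass
from typing import Callable, Union

@dataclass(frozen=True)
class Leaf:
    name: str
    fn: Callable          # acts on a tuple of naturals, returns a tuple

@dataclass(frozen=True)
class Node:
    label: str            # "C", "R" or "B"
    left: "Tree"
    right: "Tree"

Tree = Union[Leaf, Node]

def run(tree, args):
    """Evaluate the function described by a tree on a tuple of inputs."""
    if isinstance(tree, Leaf):
        return tree.fn(args)
    f, g = tree.left, tree.right
    if tree.label == "C":                 # g o f
        return run(g, run(f, args))
    if tree.label == "B":                 # <f, g>
        return run(f, args) + run(g, args)
    if tree.label == "R":                 # f # g, the last input is the counter
        x, n = args[:-1], args[-1]
        acc = run(f, x)
        for _ in range(n):
            acc = run(g, x + acc)
        return acc
    raise ValueError(tree.label)

S = Leaf("s", lambda a: (a[0] + 1,))
ID = Leaf("pi^1_1", lambda a: (a[0],))
P2 = Leaf("pi^2_2", lambda a: (a[1],))

t1 = Node("C", Node("C", S, S), S)        # s o (s o s)
t2 = Node("C", S, Node("C", S, S))        # (s o s) o s
add = Node("R", ID, Node("C", P2, S))     # add(x, n) = x + n
```

The trees `t1` and `t2` are unequal as data, yet `run` gives the same output for both on every input; this is the sense in which composition in PRdesc is not associative.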

2.2 Some Macros 

Because the trees that we are going to construct can quickly become large and
cumbersome, we will employ several programming shortcuts, called macros. We
use the macros to improve readability.

Multiple Projections. There is a need to generalize the notion of a projection.
The $\pi^k_i$ accept $k$ inputs and output one number. A multiple projection takes
$k$ inputs and produces $m$ outputs. Consider $A = \mathbb{N}^k$ and the sequence
$X = \langle x_1, x_2, \ldots, x_m \rangle$ where each $x_i$ is in $\{1, 2, \ldots, k\}$. Let $B = \mathbb{N}^m$; then for every $X$
there exists $\pi^A_B = \pi_X : A \to B$ as

$\pi_X = \langle \pi^k_{x_1}, \pi^k_{x_2}, \ldots, \pi^k_{x_m} \rangle.$

In other words, $\pi_X$ outputs the proper numbers in the order described by $X$.
Whenever possible, we shall be ambiguous with superscripts and subscripts.
Setting

$X = I = \langle 1, 2, 3, \ldots, n \rangle$

we have what looks like the identity functions. Setting

$X = \Delta = \langle 1, 2, 3, \ldots, n, 1, 2, 3, \ldots, n \rangle$

we get the diagonal function.

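Products. We would like a product of two maps. Given $f : A \to B$ and
$g : C \to D$, we would like $f \times g : A \times C \to B \times D$. The product can be defined
using the bracket as

$f \times g = \langle f \circ \pi^{A \times C}_A, \; g \circ \pi^{A \times C}_C \rangle$

or in terms of trees (each node is listed with its label and its children are
indented beneath it), $f \times g : A \times C \to B \times D$ is defined as the tree

B: $\langle f \circ \pi^{A \times C}_A, g \circ \pi^{A \times C}_C \rangle : A \times C \to B \times D$
    C: $f \circ \pi^{A \times C}_A : A \times C \to B$
        $\pi^{A \times C}_A : A \times C \to A$
        $f : A \to B$
    C: $g \circ \pi^{A \times C}_C : A \times C \to D$
        $\pi^{A \times C}_C : A \times C \to C$
        $g : C \to D$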



Diagonal Map. A diagonal map will be used. A diagonal map is a map
$\Delta : A \to A \times A$ where $x \mapsto (x, x)$. It can be defined as

$\Delta = \langle \pi^A_A, \pi^A_A \rangle : A \to A \times A$,

that is, a B node whose two children are both $\pi^A_A : A \to A$.

We took the bracket operation as fundamental and from the bracket operation
we derived the product operation and the diagonal map. We could have just
as easily taken the product and the diagonal as fundamental and constructed
the bracket as

$\langle f, g \rangle = (f \times g) \circ \Delta : A \to A \times A \to B \times C.$

Twist Map. We shall need to switch the order of inputs and outputs. The
twist map shall be defined as

$tw_{A,B} = \langle \pi^{A \times B}_B, \pi^{A \times B}_A \rangle : A \times B \to B \times A.$

Or in terms of trees:

B: $tw_{A,B} : A \times B \to B \times A$
    $\pi^{A \times B}_B : A \times B \to B$
    $\pi^{A \times B}_A : A \times B \to A$
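
The macros introduced so far can be sketched on tuples of naturals. This is a hedged illustration: the function names and the convention of passing the sizes of A, B, C as integer arguments (`a`, `b`, `c`) are our own, and every map takes and returns a tuple.

```python
def bracket(f, g):
    """<f, g>: run both maps on the same input and concatenate the outputs."""
    return lambda xs: f(xs) + g(xs)

def compose(g, f):
    """g o f: apply f first, then g."""
    return lambda xs: g(f(xs))

def multi_proj(X):
    """pi_X: output the inputs found at the 1-based positions listed in X."""
    return lambda xs: tuple(xs[i - 1] for i in X)

def product(f, g, a, c):
    """f x g on A x C, with A = N^a and C = N^c, defined through the bracket."""
    pi_A = multi_proj(range(1, a + 1))
    pi_C = multi_proj(range(a + 1, a + c + 1))
    return bracket(compose(f, pi_A), compose(g, pi_C))

def diagonal(a):
    """Delta : A -> A x A as the bracket of two identity projections."""
    ident = multi_proj(range(1, a + 1))
    return bracket(ident, ident)

def twist(a, b):
    """tw_{A,B} : A x B -> B x A as a bracket of two projections."""
    pi_B = multi_proj(range(a + 1, a + b + 1))
    pi_A = multi_proj(range(1, a + 1))
    return bracket(pi_B, pi_A)
```

With these definitions, composing `product(f, g, a, a)` with `diagonal(a)` gives the same outputs as `bracket(f, g)`, which reflects the remark that either the bracket or the product and diagonal may be taken as fundamental.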

Second Variable Product. Given a function $g_1 : A \times B \to B$ and a function
$g_2 : A \times B \to B$, we would like to take the product of these two functions while
keeping the first variable fixed. We define the operation

$g_1 \boxtimes g_2 : A \times B \times B \to B \times B$

on elements as follows

$(g_1 \boxtimes g_2)(a, b_1, b_2) = (g_1(a, b_1), g_2(a, b_2)).$

In terms of maps, $\boxtimes$ may be defined from the composition of the following maps:

$g_1 \boxtimes g_2 = (g_1 \times g_2) \circ (\pi_A \times tw_{A,B} \times \pi_B) \circ (\Delta \times \pi_{B^2}) :$

$A \times B \times B \to A \times A \times B \times B \to A \times B \times A \times B \to B \times B.$

Since the second variable product is related to the product which is derived
from the bracket, we write it as

B: $g_1 \boxtimes g_2 : A \times B \times B \to B \times B$
    $g_1 : A \times B \to B$
    $g_2 : A \times B \to B$

Second Variable Composition. Given a function $g_1 : A \times D \to B$ and a
function $g_2 : A \times C \to D$, we would like to compose the output of $g_2$ into the
second variable of $g_1$. We define the operation

$g_1 \ddot{\circ} g_2 : A \times C \to B$

on elements as follows

$(g_1 \ddot{\circ} g_2)(a, c) = g_1(a, g_2(a, c)).$

In terms of maps, $\ddot{\circ}$ may be defined as the composition of the following maps

$g_1 \ddot{\circ} g_2 = g_1 \circ (\pi_A \times g_2) \circ (\Delta \times \pi_C) :$

$A \times C \to A \times A \times C \to A \times D \to B.$

We write second variable composition as

C: $g_1 \ddot{\circ} g_2 : A \times C \to B$
    $g_2 : A \times C \to D$
    $g_1 : A \times D \to B$
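
Both second variable operations can be checked against their element-wise definitions. In the sketch below each stage of the composite is written as a slice operation on tuples; the helper names and the integer size arguments `a` and `b` (the sizes of A and B) are our own assumptions.

```python
def second_var_product(g1, g2, a, b):
    """g1 [x] g2 : A x B x B -> B x B as a composite of three maps."""
    def dup(xs):       # Delta x pi_{B^2} : A x B x B -> A x A x B x B
        return xs[:a] + xs[:a] + xs[a:]
    def shuffle(xs):   # pi_A x tw_{A,B} x pi_B : -> A x B x A x B
        return xs[:a] + xs[2 * a:2 * a + b] + xs[a:2 * a] + xs[2 * a + b:]
    def both(xs):      # g1 x g2 : A x B x A x B -> B x B
        return g1(xs[:a + b]) + g2(xs[a + b:])
    return lambda xs: both(shuffle(dup(xs)))

def second_var_compose(g1, g2, a):
    """Second variable composition : A x C -> B as a composite of three maps."""
    def dup(xs):       # Delta x pi_C : A x C -> A x A x C
        return xs[:a] + xs[:a] + xs[a:]
    def inner(xs):     # pi_A x g2 : A x A x C -> A x D
        return xs[:a] + g2(xs[a:])
    return lambda xs: g1(inner(dup(xs)))

# Hypothetical sample maps with A = B = C = D = N.
plus = lambda t: (t[0] + t[1],)
times = lambda t: (t[0] * t[1],)
```

For instance, `second_var_product(plus, times, 1, 1)` sends `(2, 3, 4)` to `(plus(2, 3), times(2, 4))`, and `second_var_compose(plus, times, 1)` sends `(2, 3)` to `plus(2, times(2, 3))`.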

3 Relations 

Given the operations of composition, recursion and bracket, what does it mean 
for us to say that two descriptions of a primitive recursive function are "es- 
sentially" the same? We shall examine these operations, and give relations to 
describe when two trees are essentially the same. If two trees are exactly alike 
except for a subtree that is equivalent to another tree, then we may replace the 
subtree with the equivalent tree. 

3.1 Composition 

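Composition is Associative. That is, for any three composable maps $f$, $g$
and $h$, we have

$h \circ (g \circ f) \approx (h \circ g) \circ f.$

In terms of trees, we say that the following two trees are equivalent:

C: $h \circ (g \circ f) : A \to D$
    C: $g \circ f : A \to C$
        $f : A \to B$
        $g : B \to C$
    $h : C \to D$

$\approx$

C: $(h \circ g) \circ f : A \to D$
    $f : A \to B$
    C: $h \circ g : B \to D$
        $g : B \to C$
        $h : C \to D$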



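Projections as Identity of Composition. The projections $\pi^A_A$ and $\pi^B_B$ act
like identity maps. That means for any $f : A \to B$, we have

$f \circ \pi^A_A \approx f \approx \pi^B_B \circ f.$

In terms of trees this amounts to

C: $f \circ \pi^A_A : A \to B$
    $\pi^A_A : A \to A$
    $f : A \to B$

$\approx$ $f : A \to B$ $\approx$

C: $\pi^B_B \circ f : A \to B$
    $f : A \to B$
    $\pi^B_B : B \to B$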

Composition and the Null Function. The null function always outputs a
0 no matter what the input is. So for any function $f : A \to \mathbb{N}$, if we are going
to compose $f$ with the null function, then $f$ might as well be substituted with
a projection, i.e.,

$n \circ f \approx n \circ \pi^A_{\mathbb{N}}.$

In terms of trees:

C: $n \circ f : A \to \mathbb{N}$
    $f : A \to \mathbb{N}$ (a subtree with leaves $f_1, f_2, \ldots, f_k$)
    $n : \mathbb{N} \to \mathbb{N}$

$\approx$

C: $n \circ \pi^A_{\mathbb{N}} : A \to \mathbb{N}$
    $\pi^A_{\mathbb{N}} : A \to \mathbb{N}$
    $n : \mathbb{N} \to \mathbb{N}$

Notice that the left side of the left tree is essentially "pruned." Although
there is much information on the left side of the left tree, it is not important.
It can be substituted with another tree that does not have that information.
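
For composition-only trees, the first two relations of this section suggest a simple normal form: list the leaves in order of application and drop identity projections. The sketch below is our own illustration and makes strong simplifying assumptions: trees are nested tuples `("C", left, right)`, leaves are strings, every identity projection is written `"id"`, types are ignored, and the null-function relation is not handled.

```python
def normal_form(tree):
    """Flatten a composition tree to its list of non-identity leaves."""
    if isinstance(tree, str):
        return [] if tree == "id" else [tree]
    label, left, right = tree
    assert label == "C", "this sketch only handles composition nodes"
    return normal_form(left) + normal_form(right)

def essentially_same(t1, t2):
    """Equivalent under associativity and the identity relations only."""
    return normal_form(t1) == normal_form(t2)

# h o (g o f) versus (h o g) o f, with the left child applied first.
left_nested = ("C", ("C", "f", "g"), "h")
right_nested = ("C", "f", ("C", "g", "h"))
```

Two trees with the same normal form are related by a chain of the associativity and identity relations; trees such as `("C", "f", "g")` and `("C", "g", "f")` are not.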



3.2 Composition and Bracket 

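Composition Distributes Over the Bracket on the Right. For $g : A \to B$,
$f_1 : B \to C_1$ and $f_2 : B \to C_2$, we have

$\langle f_1, f_2 \rangle \circ g \approx \langle f_1 \circ g, f_2 \circ g \rangle.$

In terms of procedures, this says that doing g and then doing both $f_1$ and $f_2$ is
the same as doing both $f_1 \circ g$ and $f_2 \circ g$, i.e., the following two flowcharts are
essentially the same.

[Figure: two flowcharts built from $g$, $f_1$ and $f_2$.]

In terms of trees, this amounts to saying that these trees are equivalent:

$\langle f_1, f_2 \rangle \circ g : A \to C_1 \times C_2 \;\approx\; \langle f_1 \circ g, f_2 \circ g \rangle : A \to C_1 \times C_2$

[Tree diagrams: the left root (label C) has children $g : A \to B$ and $\langle f_1, f_2 \rangle : B \to C_1 \times C_2$ (label B). The right root (label B) has children $f_1 \circ g : A \to C_1$ and $f_2 \circ g : A \to C_2$ (each label C). The leaves are $g : A \to B$, $f_1 : B \to C_1$ and $f_2 : B \to C_2$.]
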
It is important to realize that it does not make sense to require composition
to distribute over bracket on the left:

$g \circ \langle f_1, f_2 \rangle \not\approx \langle g \circ f_1, g \circ f_2 \rangle.$

The following two flowcharts are not essentially the same.

[Figure: two flowcharts built from $f_1$, $f_2$ and $g$.]

The left g requires two inputs. The right g's only require one.
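A small Python sketch can illustrate right distributivity at the level of functions. Note the hedge: agreeing on all inputs is only the extensional shadow of the relation above, which is a statement about descriptions (trees). The helpers `compose` and `bracket` and the sample maps are assumptions made for illustration.

```python
def compose(f, g):
    """f . g : apply g first, then f."""
    return lambda x: f(g(x))


def bracket(f1, f2):
    """<f1, f2> : x -> (f1(x), f2(x))."""
    return lambda x: (f1(x), f2(x))


# Hypothetical sample maps on natural numbers.
def g(x):
    return x + 1


def f1(y):
    return 2 * y


def f2(y):
    return y * y


# <f1, f2> . g  versus  <f1 . g, f2 . g>
do_g_once = compose(bracket(f1, f2), g)
do_g_twice = bracket(compose(f1, g), compose(f2, g))
```

The left-hand description evaluates `g` once and the right-hand one evaluates it twice, yet both return the same pair. A left-distributive analogue does not even type-check in general, since `g` applied to a bracket consumes a pair while `g` applied to each component consumes a single value.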



3.3 Bracket 

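Bracket is Associative. The bracket is associative. For any three maps f, g,
and h with the same domain, we have

$\langle \langle f, g \rangle, h \rangle \approx \langle f, \langle g, h \rangle \rangle.$

In terms of trees, this amounts to

$\langle \langle f, g \rangle, h \rangle : A \to B \times C \times D \;\approx\; \langle f, \langle g, h \rangle \rangle : A \to B \times C \times D$

[Tree diagrams: every internal node carries the label B. The left root has children $\langle f, g \rangle : A \to B \times C$ and $h : A \to D$; the right root has children $f : A \to B$ and $\langle g, h \rangle : A \to C \times D$. The leaves are $f : A \to B$, $g : A \to C$ and $h : A \to D$.]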

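Bracket is Almost Commutative. It is not essential what is written in the
first or the second place. For any two maps f and g with the same domain,

$\langle f, g \rangle \approx tw \circ \langle g, f \rangle.$

In terms of trees, this amounts to

$\langle f, g \rangle : A \to B \times C \;\approx\; tw \circ \langle g, f \rangle : A \to B \times C$

[Tree diagrams: the left root (label B) has leaves f and g. The right root (label C) has children $\langle g, f \rangle : A \to C \times B$ (label B, with leaves g and f) and $tw : C \times B \to B \times C$.]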

Twist is Idempotent. There are other relations that the twist map must
respect. Idempotent means

$tw_{A,B} \circ tw_{A,B} \approx \pi^{A \times B}_{A \times B} : A \times B \to A \times B.$

Twist is Coherent. We would like the twist maps of three elements to get
along with themselves.

$(tw_{B,C} \times \pi_A) \circ (\pi_B \times tw_{A,C}) \circ (tw_{A,B} \times \pi_C) \approx (\pi_C \times tw_{A,B}) \circ (tw_{A,C} \times \pi_B) \circ (\pi_A \times tw_{B,C}).$

This is called the hexagon law or the third Reidemeister move. Given the
idempotence and hexagon laws, it is a theorem that there is a unique twist map
made of smaller twist maps between any two products of elements ([16], Section
XI.4).
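The two twist laws can be checked on concrete tuples. The sketch below is an illustration under the assumption that elements of a product are modeled as Python tuples; the function names are hypothetical and only the behavior on values is tested, not the equivalence of descriptions.

```python
def tw(pair):
    """Twist on a pair: (a, b) -> (b, a)."""
    a, b = pair
    return (b, a)


def swap_first(triple):
    """tw x pi on a triple: swap the first two entries."""
    a, b, c = triple
    return (b, a, c)


def swap_last(triple):
    """pi x tw on a triple: swap the last two entries."""
    a, b, c = triple
    return (a, c, b)


def hexagon_left(triple):
    # (tw x pi) . (pi x tw) . (tw x pi), read right to left
    return swap_first(swap_last(swap_first(triple)))


def hexagon_right(triple):
    # (pi x tw) . (tw x pi) . (pi x tw), read right to left
    return swap_last(swap_first(swap_last(triple)))
```

Both sides of the hexagon law reverse the triple, and applying `tw` twice returns the original pair.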

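Bracket and Projections. A bracket followed by a projection onto the first
output means the second output is ignored: $f \approx \pi^{B \times C}_B \circ \langle f, g \rangle$. In terms of trees,
this amounts to

$f : A \to B \;\approx\; \pi^{B \times C}_B \circ \langle f, g \rangle : A \to B$

[Tree diagram: the right-hand root carries the label C; below it is the bracket $\langle f, g \rangle$ (label B) with leaves $f : A \to B$ and $g : A \to C$.]

Similarly for a projection onto the second output: $g \approx \pi^{B \times C}_C \circ \langle f, g \rangle$.

Bracket and Identity. We want the bracket to be functorial, i.e., to respect
the identity.

$\langle \pi^A_A, \pi^A_A \rangle \approx \Delta : A \to A \times A$

3.4 Bracket and Recursion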

When there are two unrelated processes, we can perform both of them in one 
loop or we can perform each of them in its own loop. 



h = (f1(x), f2(x))
For i = 1 to n
    h = (g1(x, π1 h), g2(x, π2 h))

h1 = f1(x)                  h2 = f2(x)
For i = 1 to n              For i = 1 to n
    h1 = g1(x, h1)              h2 = g2(x, h2)

In $\sharp$ notation this amounts to saying

$h = \langle f_1, f_2 \rangle \sharp (g_1 \boxtimes g_2) \approx \langle f_1 \sharp g_1, f_2 \sharp g_2 \rangle = \langle h_1, h_2 \rangle.$
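In terms of trees this says that this tree:

$h = \langle f_1, f_2 \rangle \sharp (g_1 \boxtimes g_2) : A \times N \to B \times B$

[Tree diagram: the root carries the label R. Its children are $\langle f_1, f_2 \rangle : A \to B \times B$ (label B, with leaves $f_1, f_2 : A \to B$) and $g_1 \boxtimes g_2$ (with leaves $g_1, g_2 : A \times B \to B$).]

is equivalent ($\approx$) to this tree:

$\langle h_1, h_2 \rangle = \langle f_1 \sharp g_1, f_2 \sharp g_2 \rangle : A \times N \to B \times B$

[Tree diagram: the root carries the label B. Its children are $h_1 = f_1 \sharp g_1 : A \times N \to B$ and $h_2 = f_2 \sharp g_2 : A \times N \to B$ (each label R), with leaves $f_1 : A \to B$, $g_1 : A \times B \to B$, $f_2 : A \to B$ and $g_2 : A \times B \to B$.]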
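The "one loop versus two loops" relation can be tried out in Python. This is a sketch under assumptions: `rec` models the recursion operation $f \sharp g$ with a loop, and `boxtimes` is a guess at how $g_1 \boxtimes g_2$ acts on a pair of accumulators (each $g_i$ sees only its own component). The sample maps, which compute a product and a power, are hypothetical.

```python
def rec(f, g):
    """Recursion: h(x, 0) = f(x), h(x, n + 1) = g(x, h(x, n))."""
    def h(x, n):
        acc = f(x)
        for _ in range(n):
            acc = g(x, acc)
        return acc
    return h


def bracket(f1, f2):
    """<f1, f2> : x -> (f1(x), f2(x))."""
    return lambda x: (f1(x), f2(x))


def boxtimes(g1, g2):
    """Assumed action: (x, (b1, b2)) -> (g1(x, b1), g2(x, b2))."""
    return lambda x, b: (g1(x, b[0]), g2(x, b[1]))


# Hypothetical unrelated processes: x * n and x ** n.
f1 = lambda x: 0
g1 = lambda x, acc: acc + x
f2 = lambda x: 1
g2 = lambda x, acc: acc * x

# One loop carrying a pair of accumulators.
one_loop = rec(bracket(f1, f2), boxtimes(g1, g2))

# Two separate loops, bracketed afterwards.
h1, h2 = rec(f1, g1), rec(f2, g2)
two_loops = lambda x, n: (h1(x, n), h2(x, n))
```

For `x = 3` and `n = 4` both versions yield the pair `(12, 81)`.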



3.5 Recursion and Composition 

Unwinding a Recursive Loop. Consider the following two algorithms:

h = f(x)                    h' = g1(x, f(x))
For i = 1 to n              For i = 1 to n-1
    h = g1(x, h)                h' = g2(x, h')
    h = g2(x, h)                h' = g1(x, h')
                            h' = g2(x, h')

This is the most general form of unwinding a loop. If g1 is the identity
process (does nothing), these become

h = f(x)                    h' = f(x)
For i = 1 to n              For i = 1 to n-1
    h = g2(x, h)                h' = g2(x, h')
                            h' = g2(x, h').

If g2 is the identity process, these become

h = f(x)                    h' = g1(x, f(x))
For i = 1 to n              For i = 1 to n-1
    h = g1(x, h)                h' = g1(x, h').

In terms of recursion, for the most general form of unwinding a loop, the left
top box coincides with

$h(x, 0) = f(x)$
$h(x, n+1) = g_2(x, g_1(x, h(x, n))).$

The right top box coincides with:

$h'(x, 0) = g_1(x, f(x))$
$h'(x, n+1) = g_1(x, g_2(x, h'(x, n))).$

How are these two recursions related? We claim that for all $n \in N$ we have
$g_1(x, h(x, n)) = h'(x, n)$. This may be proven by induction. The $n = 0$ case is
trivial. Assume it is true for k, and we shall show it is true for k + 1.

$g_1(x, h(x, k+1)) = g_1(x, g_2(x, g_1(x, h(x, k)))) = g_1(x, g_2(x, h'(x, k))) = h'(x, k+1).$

The first equality is from the definition of h; the second equality is the induction
hypothesis; and the third equality is from the definition of h'.
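The claim proved by induction above can also be spot-checked numerically. The following Python sketch builds both recursions from the same data and compares $g_1(x, h(x, n))$ with $h'(x, n)$; the `rec` helper and the particular choices of `f`, `g1` and `g2` are assumptions made for the example.

```python
def rec(f, g):
    """Recursion: h(x, 0) = f(x), h(x, n + 1) = g(x, h(x, n))."""
    def h(x, n):
        acc = f(x)
        for _ in range(n):
            acc = g(x, acc)
        return acc
    return h


# Hypothetical sample maps.
def f(x):
    return x


def g1(x, b):
    return b + x


def g2(x, b):
    return 2 * b + 1


# h(x, 0) = f(x),          h(x, n + 1)  = g2(x, g1(x, h(x, n)))
h = rec(f, lambda x, b: g2(x, g1(x, b)))

# h'(x, 0) = g1(x, f(x)),  h'(x, n + 1) = g1(x, g2(x, h'(x, n)))
h_prime = rec(lambda x: g1(x, f(x)), lambda x, b: g1(x, g2(x, b)))


def claim_holds(x, n):
    """Check g1(x, h(x, n)) == h'(x, n) for one input."""
    return g1(x, h(x, n)) == h_prime(x, n)
```

A finite check like this is no substitute for the induction, but it is a quick way to catch a mis-stated recursion.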

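Although $g_1 \ddot{\circ} h$ and $h'$ are constructed differently, they are essentially the
same program so we shall set them equivalent to each other: $g_1 \ddot{\circ} h \approx h'$. If one
leaves out the h and h' and uses the $\sharp$ notation, this becomes

$g_1 \ddot{\circ} (f \sharp (g_2 \ddot{\circ} g_1)) \approx (g_1 \ddot{\circ} f) \sharp (g_1 \ddot{\circ} g_2).$

In terms of trees, this means that

$g_1 \ddot{\circ} h : A \times N \to B$

[Tree diagram: the root has children $h : A \times N \to B$ (label R) and $g_1 : A \times B \to B$. The node h has children $f : A \to B$ and $g_2 \ddot{\circ} g_1 : A \times B \to B$, the latter with leaves $g_2$ and $g_1$.]

is equivalent ($\approx$) to

$h' : A \times N \to B$

[Tree diagram: the root h' (label R) has children $g_1 \ddot{\circ} f$ (label C) and $g_1 \ddot{\circ} g_2 : A \times B \to B$, with leaves $f : A \to B$, $g_1 : A \times B \to B$, $g_2 : A \times B \to B$ and $g_1 : A \times B \to B$.]

Recursion and Null. If h is defined by recursion from f and g, i.e. $h = f \sharp g$,
then by definition of recursion $h(x, 0) = f(x)$ or $h(x, n(y)) = f(x)$ where n
is the null function and $y \in N$. This means $h \circ n = f$. We shall set these
equivalent: $h \circ n \approx f$. Using the $\sharp$ notation, this amounts to: $(f \sharp g) \circ n \approx f$. In
terms of algorithms, this amounts to saying that the following two algorithms
are equivalent:

h = f(x)                    h = f(x)
For i = 1 to 0
    h = g(x, h)

In terms of trees, this is

$(h \circ n) : A \to B \;\approx\; f : A \to B$

[Tree diagram: the left root (label C) has children $n : N \to N$ and $h : A \times N \to B$ (label R), and h has leaves $f : A \to B$ and $g : A \times B \to B$. The right tree is the single leaf f.]

Notice that the g on the left tree is not on the right tree.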



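Recursion and Successor. Let h be defined by recursion from f and g,
i.e., $h = f \sharp g$. Then by definition of recursion: $h(x, k+1) = g(x, h(x, k))$ or
$h(x, s(k)) = g(x, h(x, k))$ where s is the successor function and $k \in N$. This is
the same as $h \circ s = g \circ h$. We shall set them equivalent: $h \circ s \approx g \circ h$. Using the $\sharp$
notation, this becomes $(f \sharp g) \circ s \approx g \circ (f \sharp g)$. In terms of algorithms, this says that
the following two algorithms are equivalent:

h = f(x)                    h = f(x)
For i = 1 to k+1            For i = 1 to k
    h = g(x, h)                 h = g(x, h)
                            h = g(x, h)

In terms of trees, this says that the following two trees are set equivalent:

$h \circ s : A \times N \to B \;\approx\; g \circ h : A \times N \to B$

[Tree diagrams: both roots carry the label C. The left root has children $s : N \to N$ and $h : A \times N \to B$ (label R); the right root has children $h : A \times N \to B$ (label R) and $g : A \times B \to B$. In both trees h has leaves f and g.]
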

Recursion and Identity. If $g = \pi^{A \times B}_B$, i.e., if we do recursion over the identity
function, then we are not really doing recursion at all.

3.6 Products

The product is associative. That is, for any three maps $f : A \to A'$, $g : B \to B'$
and $h : C \to C'$ the two products are equivalent:

$f \times (g \times h) \approx (f \times g) \times h : A \times B \times C \to A' \times B' \times C'$

This follows immediately from the associativity of bracket.

The product respects identity.

$\pi^A_A \times \pi^B_B \approx \pi^{A \times B}_{A \times B}.$

This falls out of the fact that the bracket respects the identity.

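Interchange Rule. We must show that the product and the composition
respect each other. In terms of maps, this corresponds to the following situation:

[Diagram: three rows $A_i \leftarrow A_i \times B_i \to B_i$ for i = 1, 2, 3, joined vertically by $f_1, f_2$ on the A side, $g_1, g_2$ on the B side, and their products in the middle.]

$(f_2 \times g_2) \circ (f_1 \times g_1)$ and $(f_2 \circ f_1) \times (g_2 \circ g_1)$ are two ways of getting from
$A_1 \times B_1$ to $A_3 \times B_3$. We shall declare these two methods equivalent:

$(f_2 \times g_2) \circ (f_1 \times g_1) \approx (f_2 \circ f_1) \times (g_2 \circ g_1).$

In terms of trees, this tree:

$(f_2 \times g_2) \circ (f_1 \times g_1) : A_1 \times B_1 \to A_3 \times B_3$

[Tree diagram: the root has children $f_1 \times g_1 : A_1 \times B_1 \to A_2 \times B_2$ and $f_2 \times g_2 : A_2 \times B_2 \to A_3 \times B_3$, with leaves $f_1 : A_1 \to A_2$, $g_1 : B_1 \to B_2$, $f_2 : A_2 \to A_3$ and $g_2 : B_2 \to B_3$.]

is equivalent ($\approx$) to this tree:

$(f_2 \circ f_1) \times (g_2 \circ g_1) : A_1 \times B_1 \to A_3 \times B_3$

[Tree diagram: the root has children $f_2 \circ f_1$ and $g_2 \circ g_1$, with leaves $f_1$, $f_2$, $g_1$ and $g_2$.]

One should realize that this equivalence is not anything new added to our
list of equivalences. It is actually a consequence of the definition of product and
the equivalences that we assume about bracket. In detail

$(f_2 \times g_2) \circ (f_1 \times g_1) = \langle f_2 \pi, g_2 \pi \rangle \circ \langle f_1 \pi, g_1 \pi \rangle \approx \langle f_2 \pi \langle f_1 \pi, g_1 \pi \rangle, g_2 \pi \langle f_1 \pi, g_1 \pi \rangle \rangle$

$\approx \langle f_2 \circ f_1 \pi, g_2 \circ g_1 \pi \rangle = (f_2 \circ f_1) \times (g_2 \circ g_1).$

The first and the last equality are from the definition of product. The first
equivalence comes from the fact that composition distributes over bracket. The
second equivalence is a consequence of the relationship between the projection
maps and the bracket.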
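The derivation of the interchange rule from the bracket can be mirrored in Python. In this sketch the product is defined through the bracket and the two projections, as in the text; the helper names and the sample maps are assumptions for illustration, and only agreement on values is checked.

```python
def compose(f, g):
    """f . g : apply g first, then f."""
    return lambda x: f(g(x))


def bracket(f, g):
    """<f, g> : x -> (f(x), g(x))."""
    return lambda x: (f(x), g(x))


def pi1(pair):
    return pair[0]


def pi2(pair):
    return pair[1]


def product(f, g):
    """f x g, defined as <f . pi1, g . pi2>."""
    return bracket(compose(f, pi1), compose(g, pi2))


# Hypothetical sample maps.
f1 = lambda a: a + 1
f2 = lambda a: a * 3
g1 = lambda b: b - 2
g2 = lambda b: b * b

# (f2 x g2) . (f1 x g1)  versus  (f2 . f1) x (g2 . g1)
compose_of_products = compose(product(f2, g2), product(f1, g1))
product_of_composes = product(compose(f2, f1), compose(g2, g1))
```

On the input `(1, 5)` both sides give `(6, 9)`.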



4 Algorithms 

We have given relations telling when two programs/trees/descriptions are sim- 
ilar. We would like to look at the equivalence classes that these relations gen- 
erate. It will become apparent that by taking PRdesc and "modding out" by 
these equivalence relations, we shall get more structure. 

The relations split up into two disjoint sets: those for which there is a loss
of information and those for which there is no loss of information. Let us call
the former set of relations (I) and the latter set (II). The following relations
are in group (I).

1. Null Function and Composition: $n \circ f \approx n \circ \pi^A_N$

2. Bracket and First Projection: $f \approx \pi^{B \times C}_B \circ \langle f, g \rangle$

3. Bracket and Second Projection: $g \approx \pi^{B \times C}_C \circ \langle f, g \rangle$

4. Recursion and Null Function: $(f \sharp g) \circ n \approx f$

After setting these trees equivalent, there exists the following quotient graph 
and graph morphism. 

PRdesc $\twoheadrightarrow$ PRdesc/(I)

In detail, PRdesc/(I) has the same vertices as PRdesc, namely powers of the
set of natural numbers. The edges are equivalence classes of edges of PRdesc.

Descriptions of primitive recursive functions which are equivalent to "pruned"
descriptions by relations of type (I) we shall call "stupid descriptions". They
are descriptions that are wasteful in the sense that part of their tree is
dedicated to describing a certain function and that function is not needed. The
part of the tree that describes the unneeded function can be lopped off. One
might call PRdesc/(I) the graph of "intelligent descriptions" since within this
graph every "stupid description" is equivalent to another program without the
wastefulness.

We can further quotient PRdesc/(I) by relations of type (II): 

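1. Composition Is Associative: $f \circ (g \circ h) \approx (f \circ g) \circ h$.

2. Projections Are Identities: $f \circ \pi^A_A \approx f \approx \pi^B_B \circ f$.

3. Composition Distributes Over Bracket: $\langle f_1, f_2 \rangle \circ g \approx \langle f_1 \circ g, f_2 \circ g \rangle$.

4. Bracket Is Associative: $\langle f, \langle g, h \rangle \rangle \approx \langle \langle f, g \rangle, h \rangle$.

5. Bracket Is Almost Commutative: $\langle f, g \rangle \approx tw \circ \langle g, f \rangle$.

6. Bracket Is Functorial: $\langle \pi^A_A, \pi^A_A \rangle \approx \Delta$.

7. Twist Is Idempotent: $tw \circ tw \approx \pi$.

8. Reidemeister III:

$(tw_{B,C} \times \pi_A) \circ (\pi_B \times tw_{A,C}) \circ (tw_{A,B} \times \pi_C) \approx (\pi_C \times tw_{A,B}) \circ (tw_{A,C} \times \pi_B) \circ (\pi_A \times tw_{B,C})$.

9. Recursion and Bracket: $\langle f_1, f_2 \rangle \sharp (g_1 \boxtimes g_2) \approx \langle f_1 \sharp g_1, f_2 \sharp g_2 \rangle$.

10. Recursion and Composition: $g_1 \ddot{\circ} (f \sharp (g_2 \ddot{\circ} g_1)) \approx (g_1 \ddot{\circ} f) \sharp (g_1 \ddot{\circ} g_2)$.

11. Recursion and Successor Function: $(f \sharp g) \circ s \approx g \circ (f \sharp g)$.

There is a further projection onto the quotient graph:

PRdesc $\twoheadrightarrow$ PRdesc/(I) $\twoheadrightarrow$ PRalg = (PRdesc/(I))/(II) = PRdesc/((I) $\cup$ (II))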

PRalg, or primitive recursive algorithms, are the main object of interest in this 
Section. 

What does PRalg look like? Again the objects are the same as PRdesc, 
namely powers of the set of natural numbers. The edges are equivalence classes 
of edges of PRdesc. 

What type of structure does it have? In PRalg, for any three composable
arrows, we have

$f \circ (g \circ h) = (f \circ g) \circ h$

and for any arrow $f : A \to B$ we have

$f \circ \pi^A_A = f = \pi^B_B \circ f.$

That means that composition is associative and that the $\pi$'s act as identities.
Whereas PRdesc was only a graph with a composition and identities that did
not act like identities, PRalg is a genuine category.

PRalg has more structure than only a category. For one, there is a strictly
associative product. On objects, the product structure is obvious:

$N^m \times N^n = N^{m+n}.$

On morphisms, the product $\times$ was defined using the bracket above. The $\pi$ are
the projections of the product. In PRalg the twist map is idempotent and
coherent. The fact that the product respects the composition is expressed with
the interchange rule.

The category PRalg is closed under recursion. In other words, for any
$f : A \to B$ and any $g : A \times B \to B$, there exists an $h : A \times N \to B$ defined
by recursion. The categorical way of saying that a category is closed under
recursion is to say that the category contains a weak parameterized natural
number object. The simplest definition of a weak natural number object in a
category is a diagram

$* \xrightarrow{0} N \xrightarrow{s} N$

such that for any $k \in N$ and $g : N \to N$, there exists an $h : N \to N$ such that the
following diagram commutes.

[Diagram: the row $* \xrightarrow{0} N \xrightarrow{s} N$ sits above a copy $N \xrightarrow{g} N$, joined by two vertical arrows h, so that $h \circ 0 = k$ and $h \circ s = g \circ h$.]

(See e.g. [31 21 [H]). Following [TS], we do not insist that the h is a unique
morphism that satisfies the condition. When there is uniqueness, we say that
the natural number object is strong. Saying that the above diagram commutes
is the same as saying that h is defined by the simplest recursion scheme. For
our more general version of recursion, we require a weak parameterized natural
number object, that is, for every $f : A \to B$ and $g : A \times B \to B$ there exists an
$h : A \times N \to B$ such that the following two squares commute.



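[Diagram: two commuting squares relating $A \times *$, $A \times N$, $A \times B$ and $B$ through $h : A \times N \to B$.]

From the fact that in PRalg we have an object N, the morphisms $0 : * \to N$
and $s : N \to N$, and these morphisms satisfy $h \circ n = (f \sharp g) \circ n = f$ and
$h \circ s = (f \sharp g) \circ s = g \circ (f \sharp g) = g \circ h$, we see that PRalg has a weak
parameterized natural number object.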

Some words on the uniqueness of h are needed. Given descriptions f and g of
the correct arity, we can form the description $h = (f \sharp g)$. This h will satisfy the
requirements of the parameterized natural number object. But there is no reason
to think that this is the only description that would satisfy the requirements.
Any other description of the same function that h performs would also satisfy
the requirement. This is in sharp contrast to a category of functions. Given
primitive recursive functions f and g of the right arity, there is only one function
$h = (f \sharp g)$ that satisfies the recursion axiom. One can think of this distinction
as a fundamental difference between syntax and semantics. In a syntactical
category, it is impossible to demand uniqueness. There are many descriptions
of objects that satisfy conditions. In contrast, within semantic categories, there
is only one object that satisfies requirements. In Lambek and Scott [15], they
deal with syntactical categories of proof and there too, they only have a weak
natural number object (page 46). Similarly, in Peter Johnstone's discussion
of lambda-calculus in Proposition 4.2.12 on page 959 of volume II of [H], the
natural number object in the syntactical category is weak.

We must show that in PRalg, the natural number object respects the
bracket operation. This fundamentally says that the central square in the
following two diagrams commute.

[Diagram: a central square inside inner and outer quadrilaterals, built on $A \times * \to A \times N$, with triangles on the left and right.]

The left hand triangles commute from the fact that * is a terminal object. The
right hand triangles commute because the equivalence relation forced the
projections to respect the bracket. The inner and outer quadrilateral are assumed
to commute. We conclude that the central square commutes.

[Diagram: a central square inside inner and outer quadrilaterals, built on $A \times N \to A \times N$ and $A \times B$, with triangles on the left and right and an arrow $g_2$.]

Similarly, the left and the right triangles commute because the projections act
as they are supposed to. The inner and outer quadrilateral commute out of
assumption. We conclude that the central square commutes.

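We also must show that the natural number object respects the composition
of morphisms. In $\sharp$ notation this amounts to

$g_1 \ddot{\circ} (f \sharp (g_2 \ddot{\circ} g_1)) = (g_1 \ddot{\circ} f) \sharp (g_1 \ddot{\circ} g_2).$

For the simpler form of recursion, this reduces to

$g_1 \circ (k \sharp (g_2 \circ g_1)) = (g_1 \circ k) \sharp (g_1 \circ g_2).$

Setting $h = k \sharp (g_2 \circ g_1)$ and $h' = (g_1 \circ k) \sharp (g_1 \circ g_2)$, we get the following natural
number object diagram

[Diagram: the natural number object $* \to N \to N$ together with the maps h and h'.]

With the properties of h and h' we get that the triangles commute.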

Once we have PRalg, we might ask when two algorithms perform the
same operation. We make an equivalence relation and say two algorithms are
equivalent ($\approx'$) iff they perform the same operation. By taking a further quotient
of PRalg we get PRfunc. What does PRfunc look like? The objects are again
powers of the set of natural numbers and the morphisms are primitive recursive
functions.

In summary, we have the following diagram.

PRdesc $\twoheadrightarrow$ PRdesc/(I) $\twoheadrightarrow$ PRalg = PRdesc/((I) $\cup$ (II)) $\twoheadrightarrow$ PRfunc = PRalg/$\approx'$.

Let us spend a few moments with some category theory. There is the
category Cat of all (small) categories and functors between them. Consider
also the category CatXN. The objects are triples $(C, \times, N)$ where C is a
(small) category, $\times$ is a strict product on C and N is a weak parameterized
natural number object in C. The morphisms of CatXN are functors
$F : (C, \times, N) \to (C', \times', N')$ that respect the product and the natural number
object. For $F : C \to C'$ to respect the product, we mean that

For all $f, g \in C$, $F(f \times g) = F(f) \times' F(g)$.

To say that F respects the natural number object means that if

$* \xrightarrow{0} N \xrightarrow{s} N$

is a natural number object in C and

$*' \xrightarrow{0'} N' \xrightarrow{s'} N'$

is a natural number object in C' then $F(N) = N'$, $F(*) = *'$, $F(0) = 0'$ and
$F(s) = s'$. For a given natural number object in a category, there is an implied
function $\sharp$ that takes two morphisms f and g of the appropriate arity and outputs
the unique $h = f \sharp g$ of the appropriate arity. Our definition of a morphism
between two objects in CatXN implies that

For all appropriate $f, g \in C$, $F(f \sharp g) = F(f) \sharp' F(g)$.

There is an obvious forgetful functor $U :$ CatXN $\to$ Cat that takes $(C, \times, N)$
to C. There exists a left adjoint to this forgetful functor:

$L :$ Cat $\rightleftarrows$ CatXN $: U$.

This adjunction means that for all small categories $C \in$ Cat and $D \in$ CatXN
there is an isomorphism

CatXN$(L(C), D) \cong$ Cat$(C, U(D))$.

Taking C to be the empty category $\emptyset$ we have

CatXN$(L(\emptyset), D) \cong$ Cat$(\emptyset, U(D))$.

Since $\emptyset$ is the initial object in Cat, the right set has only one object. In other
words $L(\emptyset)$ is a free category with product and a weak parameterized natural
number object and it is an initial object in the category CatXN.
We claim that $L(\emptyset)$ is none other than our category PRalg.

Theorem 1 PRalg is an initial object in the category of categories with a strict 
product and a weak parameterized natural number object. 

We have already shown that PRalg is a category with a strict product
and a natural number object. It remains to be shown that for any object
$(D, \times', N') \in$ CatXN there is a unique functor $F_D :$ PRalg $\to D$. Our task
is already done by recalling that the objects and morphisms in PRalg are all
generated by the natural number object and that functors in CatXN must
preserve this structure. In detail, $F_D(N) = N'$ and since $F_D$ must preserve
products, $F_D(N^i) = (N')^i$. And similarly for the morphisms of PRalg. The
morphisms are generated by the $\pi$'s, the n and s in the natural number object
of PRalg. They are generated by composition, product and recursion. $F_D$ is
a functor and so it preserves composition. We furthermore assume it preserves
product and recursion. $(D, \times', N') \in$ CatXN might have many more objects and
morphisms but that is not our concern here. PRalg has very few morphisms.

The point of this theorem is that PRalg is not simply a nice category where
all algorithms live. Rather it is a category with much structure. The structure
tells us how algorithms are built out of each other. PRalg by itself is not very
interesting. It is only its extra structure that demonstrates the importance of
this theorem. PRalg is not simply the category made of algorithms, rather, it
is the category that makes up algorithms.

PRfunc is the smallest category with a strict product and a strong param- 
eterized natural number object. 

Before we go on to other topics, it might be helpful to — literally — step away 
from the trees and look at the entire forest. What did we do here? The graph 
PRdesc has operations. Given edges of the appropriate arity, we can compose 
them, bracket them or do recursion on them. But these operations do not 
have much structure. PRdesc is not even a category. By placing equivalence 
relations on PRdesc, which are basically coherence relations, we are giving the 
quotient category better and more amenable structure. So coherence theory, 
sometimes called higher-dimensional algebra, tells us when two programs are
essentially the same. 

5 Complexity Results 

An algorithm is not one arrow in the category PRalg. An algorithm is a scheme 
of arrows, one for every input size. We need a way of choosing each of these 
arrows. 

There are many different species of algorithms. There are algorithms that
accept n numbers and output one number. A scheme for such an algorithm
might look like this:

[Diagram: edges $N^1 \to N$, $N^2 \to N$, $N^3 \to N$, ..., one for each input size.]

We shall call such a graph a star graph and denote it $\star$.

However there are other species of algorithms. There are algorithms that
accept n numbers and output n numbers (like sorting or reversing a list, etc.).
Such a scheme looks like

$N^1 \to N^1, \quad N^2 \to N^2, \quad \ldots, \quad N^k \to N^k, \quad \ldots$

We shall also call such a graph a star graph.

One can think of many other possibilities. For example, an algorithm that
accepts n numbers and outputs their max, average and minimum (or mean,
median and mode) outputs three numbers. We shall not be particular as to
what type of star graph we will be working with.

Given any star graph $\star$, a scheme that chooses one primitive recursive
description for each edge is a graph homomorphism $Sch : \star \to$ PRdesc that
is the identity on vertices, i.e., $Sch(N^i) = N^i$ for all $i \in N$.

Composing $Sch : \star \to$ PRdesc with the projection onto the equivalence
classes PRdesc $\to$ PRdesc/(I) gives a graph homomorphism $\star \to$
PRdesc/(I). In order not to have too many names flying around, we shall
also call this graph homomorphism Sch. Continuing to compose with the
projections, we get the following commutative diagram.

[Diagram: the scheme Sch maps $\star$ into each term of PRdesc $\twoheadrightarrow$ PRdesc/(I) $\twoheadrightarrow$ PRalg $\twoheadrightarrow$ PRfunc.]

We are not interested in only one graph homomorphism $\star \to$ PRdesc.
Rather we are interested in the set of all graph homomorphisms. We shall call
this set PRdesc*. Similarly, we shall look at the set of all graph homomorphisms
from $\star$ to PRdesc/(I), which we shall denote (PRdesc/(I))*. There
is also PRalg* and PRfunc*. There are also obvious projections:

PRdesc* $\twoheadrightarrow$ (PRdesc/(I))* $\twoheadrightarrow$ PRalg* $\twoheadrightarrow$ PRfunc*



Perhaps it is time to get down from the abstract highland and give two 
examples. We shall present mergesort and insertion sort as primitive recursive 
algorithms. They are two different members of PRalg*. These two different 
algorithms perform the same function in PRfunc*. 

Example: Mergesort depends on an algorithm that merges two sorted lists into
one sorted list. We define an algorithm Merge that accepts m numbers of the
first list and n numbers of the second list. Merge inputs and outputs m + n
numbers.

$Merge_{0,1}(x_1) = Merge_{1,0}(x_1) = \pi^N_N(x_1) = x_1$

$Merge_{m,n}(x_1, x_2, \ldots, x_{m+n}) =$

    $(Merge_{m,n-1}(x_1, x_2, \ldots, x_{m+n-1}), x_{m+n})$ : $x_m \le x_{m+n}$

    $(Merge_{m-1,n}(x_1, x_2, \ldots, x_{m-1}, x_{m+1}, \ldots, x_{m+n}), x_m)$ : $x_m > x_{m+n}$

With Merge defined, we go on to define MergeSort. MergeSort recursively
splits the list into two parts, sorts each part and then merges them.

$MergeSort_1(x) = \pi^N_N(x) = x$

$MergeSort_k(x_1, x_2, \ldots, x_k) = Merge_{\lfloor k/2 \rfloor, \lceil k/2 \rceil}(MergeSort_{\lfloor k/2 \rfloor}(x_1, x_2, \ldots, x_{\lfloor k/2 \rfloor}), MergeSort_{\lceil k/2 \rceil}(x_{\lfloor k/2 \rfloor + 1}, x_{\lfloor k/2 \rfloor + 2}, \ldots, x_k))$

We might write this in short as

$MergeSort = Merge \circ \langle MergeSort, MergeSort \rangle$

□

Example: Insertion sort uses an algorithm $Insert : N^k \times N \to N^{k+1}$ which
takes an ordered list of k numbers and adds a (k + 1)th number to that list in its
correct position. In detail,

$Insert_0(x) = \pi^N_N(x) = x$

$Insert_k(x_1, x_2, \ldots, x_k, x) =$

    $(x_1, x_2, \ldots, x_k, x)$ : $x_k \le x$

    $(Insert_{k-1}(x_1, x_2, \ldots, x_{k-1}, x), x_k)$ : $x_k > x$

The top case is the function $\pi^k_k \times \pi^N_N$ and the bottom case is the function
$(Insert_{k-1} \times \pi) \circ (\pi^{k-1}_{k-1} \times tw_{N,N})$. With Insert defined, we go on to define
InsertionSort.

$InsertionSort_1(x) = \pi^N_N(x) = x$

$InsertionSort_k(x_1, x_2, \ldots, x_k) = Insert_{k-1}(InsertionSort_{k-1}(x_1, x_2, \ldots, x_{k-1}), x_k)$

We might write this in short as

$InsertionSort = Insert \circ (InsertionSort \times \pi)$

□

The point of these examples is to show that although these two algorithms
perform the same function, they are clearly very different algorithms.
Therefore one cannot say that they are "essentially" the same.
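The two examples can be transcribed into Python to see that they compute the same function by visibly different recursions. This is a sketch under assumptions: lists are modeled as tuples, the base cases for empty inputs are added for convenience, and the function names are illustrative rather than the paper's.

```python
def merge(xs, ys):
    """Merge two sorted tuples by peeling off the larger last element."""
    if not xs:
        return ys
    if not ys:
        return xs
    if xs[-1] > ys[-1]:
        return merge(xs[:-1], ys) + (xs[-1],)
    return merge(xs, ys[:-1]) + (ys[-1],)


def merge_sort(xs):
    """Split in half, sort each half, then merge."""
    xs = tuple(xs)
    if len(xs) <= 1:
        return xs
    mid = len(xs) // 2
    return merge(merge_sort(xs[:mid]), merge_sort(xs[mid:]))


def insert(sorted_xs, x):
    """Insert x into an already sorted tuple."""
    if not sorted_xs or sorted_xs[-1] <= x:
        return sorted_xs + (x,)
    return insert(sorted_xs[:-1], x) + (sorted_xs[-1],)


def insertion_sort(xs):
    """Sort the first k - 1 entries, then insert the last one."""
    xs = tuple(xs)
    if len(xs) <= 1:
        return xs
    return insert(insertion_sort(xs[:-1]), xs[-1])
```

Calling either function on `(5, 2, 9, 1, 5, 6)` yields `(1, 2, 5, 5, 6, 9)`, so they agree as functions, while their call structure (halving versus peeling off one element) differs.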



Now that we have placed the objects of study in order, let us classify them via
complexity theory. The only operations in our trees that are of any complexity
are the recursions. Furthermore, the recursions are only interesting if they are
nested within each other. So for a given tree that represents a description of a
primitive recursive function, we might ask what is the largest number of nested
recursions in this tree. In other words, we are interested in the largest number
of "R" labels on a path from the root to a leaf of the tree. Let us call this the
Rdepth of the tree.

Formally, Rdepth is defined recursively on the set of our labeled binary trees.
The Rdepth of a one element tree is 0. The Rdepth of an arbitrary tree T is

$Rdepth(T) = Max\{Rdepth(left(T)), Rdepth(right(T))\} + (label(T) == R)$

where $(label(T) == R) = 1$ if the label of the root of T is R, otherwise it is
0.
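The recursive definition of Rdepth translates directly into code. The sketch below assumes a minimal binary-tree representation (the `Tree` class and its field names are hypothetical) and uses two schematic trees, loosely shaped like descriptions of addition and multiplication, purely as test data.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Tree:
    label: str                          # "R", "C", "B", or a leaf name
    left: Optional["Tree"] = None
    right: Optional["Tree"] = None


def rdepth(tree: Optional[Tree]) -> int:
    """Largest number of "R" labels on a root-to-leaf path."""
    if tree is None:
        return 0
    if tree.left is None and tree.right is None:
        return 0                        # a one element tree has Rdepth 0
    below = max(rdepth(tree.left), rdepth(tree.right))
    return below + (1 if tree.label == "R" else 0)


# Schematic trees (assumed shapes, not faithful descriptions):
# one recursion, and a recursion whose step contains another recursion.
add_tree = Tree("R", Tree("pi"), Tree("s"))
mult_tree = Tree("R", Tree("n"), Tree("C", add_tree, Tree("pi")))
```

Here `rdepth(add_tree)` is 1 and `rdepth(mult_tree)` is 2, since the second tree nests one recursion inside another.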

It is known that a primitive recursive function that can be expressed by a 
tree with Rdepth of n or less is an element of Grzegorczyk's hierarchy class 
(See [8,, Theorem 3.31 for sources.) 

Complexity theory deals with the partial order of all functions $\{f \mid f : N \to R^+\}$
where

gin) 

For every algorithm we can associate a function that describes the Rdepth
of the trees used in that algorithm. Formally, for a given algorithm, $A : \star \to$
PRdesc, we can associate a function $f_A : N \to R^+$ where

$f_A(n) = Rdepth(A(c_n))$

when $c_n$ is an edge in $\star$. The function PRdesc* $\to \{f \mid f : N \to R^+\}$ where
$A \mapsto f_A$ shall be called $Rdepth_0$.
We may extend $Rdepth_0$ to

$Rdepth_1 :$ (PRdesc/(I))* $\to \{f \mid f : N \to R^+\}$.

For a scheme of algorithms $[A] : \star \to$ PRdesc/(I) we define

$f_{[A]}(n) = Min_{A'} \{Rdepth(A'(c_n))\}$

where the minimization is over all descriptions A' in the equivalence class [A].
(For the categorical cognoscenti, $Rdepth_1$ is a right Kan extension of $Rdepth_0$
along the projection PRdesc* $\to$ (PRdesc/(I))*.)
$Rdepth_1$ can easily be extended to

$Rdepth_2 :$ PRalg* $\to \{f \mid f : N \to R^+\}$.

The following theorem will show us that we do not have to take a minimum 
over an entire equivalence class. 



Theorem 2 Equivalence relations of type (II) respect Rdepth.

Proof. Examine all the trees that express these relations throughout this paper.
Notice that if two trees are equivalent, then their Rdepths are equal. □

$Rdepth_2$ can be extended to

$Rdepth_3 :$ PRfunc* $\to \{f \mid f : N \to R^+\}$.

We do this again with a minimization over the entire equivalence class (i.e. a
Kan extension).

And so we have the following (not necessarily commutative) diagram.

PRdesc* $\twoheadrightarrow$ (PRdesc/(I))* $\twoheadrightarrow$ PRalg* $\twoheadrightarrow$ PRfunc*

[Diagram: each of the four sets above maps down to $\{f \mid f : N \to R^+\}$ by $Rdepth_0$, $Rdepth_1$, $Rdepth_2$ and $Rdepth_3$ respectively.]

Corollary 1 The center triangle of the above diagram commutes.

This is in contrast to the other two triangles which do not commute.

In order to see why the right triangle does not commute, consider an inefficient
sorting algorithm. $Rdepth_2$ will take this inefficient algorithm to a large
function $N \to R^+$. However, there are efficient sorting algorithms and $Rdepth_3$
will associate a smaller function to the primitive recursive function of sorting.

There are many subclasses of $\{f \mid f : N \to R^+\}$ like polynomials or exponential
functions. Complexity theory studies the preimage of these subclasses
under the function $Rdepth_3$. The partial order in $\{f \mid f : N \to R^+\}$ induces a
partial order of subclasses of PRfunc which are the "complexity classes."



6 Future Directions 

We are in no way finished with this work and there are many directions in
which it can be extended.

Extend to all Computable Functions. The most obvious project that we
are pursuing is to extend this work from primitive recursive functions to all
computable functions. In order to do this we must add the minimization operation.
For a given $g : A \times N \to N$, there is an $h : A \to N$ such that

$h(x) = Min_n \{g(x, n) = 1\}$

Categorically, this amounts to looking at the total order of N. This induces
an order on the set of all functions from A to N. We then look at all functions
h' that make this square commute.

[Diagram: a square with corners A, *, $A \times N$ and N, with $g : A \times N \to N$ along the bottom and $1 : * \to N$ on the right.]

i.e.,

$g(x, h'(x)) = 1.$

Let $h : A \to N$ be the minimum such function.

We might want to generalize this operation. Let $f : A \to B$ and $g : A \times N \to B$,
then we define $h : A \to N$ to be the function

$h(x) = Min_n \{g(x, n) = f(x)\}.$

Categorically, this amounts to looking at all functions h' that make the triangle
commute:

[Diagram: a triangle with corners A, $A \times N$ and B, with f out of A and $g : A \times N \to B$.]

i.e.,

$g(x, h'(x)) = f(x).$

Let $h : A \to N$ be the minimum such function.

Hence minimization is a fourth fundamental operation:

[Tree diagram: the root $h : A \to N$ has children $f : A \to B$ and $g : A \times N \to B$.]
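The generalized minimization operation can be sketched in Python as an unbounded search. The function name `minimize` and its optional `limit` parameter are assumptions added for illustration: without a limit the search may never return, which is exactly the partiality the following paragraph worries about.

```python
from itertools import count


def minimize(f, g, limit=None):
    """Return h with h(x) = least n such that g(x, n) == f(x).

    With limit=None the search is unbounded and may not terminate.
    With a limit, None is returned when no witness is found below it.
    """
    def h(x):
        candidates = count() if limit is None else range(limit)
        for n in candidates:
            if g(x, n) == f(x):
                return n
        return None
    return h


# Special case from the text: least n with g(x, n) = 1.
# Hypothetical g: test whether n * n has reached x.
ceil_sqrt = minimize(lambda x: 1, lambda x, n: 1 if n * n >= x else 0)

# General case: least n with n * n equal to x, searched below a bound.
exact_sqrt = minimize(lambda x: x, lambda x, n: n * n, limit=20)
```

Here `ceil_sqrt(10)` is 4 and `exact_sqrt(49)` is 7, while `exact_sqrt(50)` returns `None` because no n below the bound squares to 50.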

There are several problems that are hard to deal with. First, we leave the 
domain of total functions and go into the troublesome area of partial functions. 
All the relational axioms have to be reevaluated from this point of view. Second, 
what should we substitute for Rdepth as a complexity measure? 

Progress is being made in this direction in a forthcoming paper by Yuri
Manin and the author [19].

Other Types of Algorithms. We have dealt with classical deterministic
algorithms. Can we do the same things for other types of algorithms? For example,
it would be nice to have universal properties of categories of non-deterministic
algorithms, probabilistic algorithms, parallel algorithms, quantum algorithms,
etc. In some sense, with the use of our bracket operation, we have already dealt
with parallel algorithms. Quantum algorithms are a little harder because the
no-cloning theorem does not permit one to have a fully defined product which
can lead to a diagonalization map $x \mapsto (x, x)$.

More Relational Axioms. It would be interesting to look at other relations 
that tell when two programs are essentially the same. With each new relation, 
we will get different categories of algorithms and a projection from the old 
category of algorithms to the new one. With each new relation, one must find 
the universal properties of the category of algorithms. 

Canonical Presentations of Algorithms. Looking at the equivalent trees,
one might ask whether there is a canonical presentation of an algorithm. Perhaps
we can push up the recursions to the top of the tree, or perhaps push the 
brackets to the bottom. This would be most useful for program correctness and 
other areas of computer science. 

In a sense, Kleene's Theorem on partial recursive functions is an example 
of a canonical presentation of an algorithm. It says that for every computable 
function, there exists at least one tree-like description of the function such that 
the root of the tree is the only minimization in the entire tree. 

When are Two Programs Really Different Algorithms? Is there a way
to tell when two programs are really different algorithms? There is a subbranch 
of homotopy theory called obstruction theory. Obstruction theory asks when 
two topological spaces are in different homotopy classes of spaces. Is there an 
obstruction theory of algorithms? 

Other Universal Objects in CatXN. We only looked at one element of 
CatXN namely PRalg. But there are many other elements that are worthy of 
study. Given an arbitrary function f : N → N, consider the category C_f with 
N as its only object and f as its only non-trivial morphism. The free CatXN 
category over C_f is the category of primitive recursive functions with oracle 
computations from f. It would be nice to frame relative computation theory 
and complexity theory from this perspective. 
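
Primitive recursion relative to an oracle can be made concrete with a small sketch. The combinator names below (`proj`, `compose`, `prim_rec`, `add`, `oracle_sum`) are illustrative and do not come from this paper, and the oracle f is simply passed in as an ordinary Python callable. The sketch models only the functions, not the free category or its universal property.

```python
from typing import Callable

Fn = Callable[..., int]


def zero(*args: int) -> int:
    """Constant zero function of any arity."""
    return 0


def succ(n: int) -> int:
    """Successor function."""
    return n + 1


def proj(i: int) -> Fn:
    """Projection onto the i-th argument (0-indexed)."""
    return lambda *args: args[i]


def compose(g: Fn, *hs: Fn) -> Fn:
    """Composition: (x1..xk) -> g(h1(x1..xk), ..., hm(x1..xk))."""
    return lambda *args: g(*(h(*args) for h in hs))


def prim_rec(base: Fn, step: Fn) -> Fn:
    """Primitive recursion on the last argument:
    h(xs, 0) = base(xs);  h(xs, n + 1) = step(xs, n, h(xs, n))."""
    def h(*args: int) -> int:
        *xs, n = args
        acc = base(*xs)
        for k in range(n):
            acc = step(*xs, k, acc)
        return acc
    return h


# add(x, n) = x + n, built only from the basic functions above.
add = prim_rec(proj(0), compose(succ, proj(2)))


def oracle_sum(f: Fn) -> Fn:
    """S(n) = f(0) + ... + f(n - 1), treating the oracle f as one more
    basic function alongside zero, succ, and the projections."""
    # step(k, acc) = add(acc, f(k))
    step = compose(add, proj(1), compose(f, proj(0)))
    return prim_rec(zero, step)
```

For example, `oracle_sum(lambda n: n * n)(4)` sums the first four squares. Nothing in `oracle_sum` inspects how f is computed, which is the sense in which the computation is relative to f.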

Proof Theory. There are many similarities between our work and work in 
proof theory. Many times, one sees two proofs that are essentially the same. In 
a sense, Lambek and Scott's excellent book [15] has the proof theory version of 
this paper. They look at equivalence classes of proofs to get categories with extra 
structure. There is a belief that a program/algorithm implementing a function 
f is a proof of the fact that f(x) = y. Following this intuition, there should be 
a very strong relationship between our work and the work done in proof theory. 
It would be nice to formalize this relationship. The work of Maietti (e.g. [17]) 
is in this direction. 

A Language Independent Definition of Algorithms. Our definition of 
algorithm is dependent on the language of primitive recursive functions. We 
could have, no doubt, done the same thing for other languages. The intuitive 
notion of an algorithm is language independent. Can we find a definition of an 
algorithm that does not depend on any language? 

Consider the set of all programs in all programming languages. Call this 
set Programs. Partition this set by the different programming languages that 
make the programs. So there will be a subset of Programs called Java, a 
subset called C++, and a subset PL/1 etc. There is also a subset called 
Primitive Recursive which will contain all the trees that we discussed in 
Section 3. There will be functions between these different subsets. We might 
call these functions (non-optimizing) compilers. They take as input a program 
from one programming language and output a program in another programming 
language. In some sense Primitive Recursive is initial for all of these sets. 
By initial we mean that there are compilers going out of it. There are few 
compilers going into it. The reason for this is that in C++ one can program the 
Ackermann function. One cannot do this in Primitive Recursive. (There are, 
of course, weaker programming languages than primitive recursive functions, but 
we ignore them here.) 

For each subset of programs, e.g. Progs1, there is an equivalence relation 
≈_Progs1 or ≈_1 that tells when two programs in the subset are essentially the 
same. If C is a compiler from Progs1 to Progs2 then we demand that if two 
programs in Progs1 are essentially the same, then the compiled versions of each 
of these programs will also be essentially the same, i.e., for any two programs 
P and P' in Progs1, 

P ≈_1 P' implies C(P) ≈_2 C(P'). 

We also demand that if there are two compilers C and C' from Progs1 to Progs2, 
then the two compiled programs will be essentially the same, 

For all programs P, C(P) ≈_2 C'(P). 

Now place the following equivalence relation = on the set Programs of all 
programs. Two programs are equivalent if they are in the same programming 
language and they are essentially the same, i.e., 

P = P' if there exists a relation ≈_i such that P ≈_i P' 

and two programs are equivalent if they are in different programming languages 
but there exists a compiler that takes one to the other, 

P = P' if there exists a compiler C such that C(P) = P'. 

We have now placed an equivalence relation on the set of all programs that 
tells when two programs are essentially the same. The equivalence classes of 
Programs/= are algorithms. This definition does not depend on any preferred 
programming languages. There is much work to do in order to formulate these 
ideas correctly. It would also be nice to list the properties of Algorithms = 
Programs/=. 
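
On a finite toy universe the construction can be sketched directly. Everything named below is a hypothetical illustration and not part of this paper: the two "languages" (`infix` and `rpn`), the whitespace-insensitive relations, and the compiler `infix_to_rpn`. The sketch also assumes that the relation on Programs is taken to be the equivalence relation generated by the two clauses, which a union-find structure computes, and it only closes the universe under one application of each compiler.

```python
from itertools import combinations


class UnionFind:
    """Minimal union-find over hashable nodes."""

    def __init__(self):
        self.parent = {}

    def find(self, x):
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            self.parent[ra] = rb


def compiler_respects(comp, same_src, same_dst, src_progs):
    """Check on a finite sample that P ~1 P' implies C(P) ~2 C(P')."""
    return all(same_dst(comp(p), comp(q))
               for p, q in combinations(src_progs, 2)
               if same_src(p, q))


def algorithms(programs, same_within, compilers):
    """Return the equivalence classes of (language, program) pairs."""
    # Add each compiled program to its target language (one step only).
    progs = {lang: set(ps) for lang, ps in programs.items()}
    for (src, dst), comp in compilers.items():
        progs.setdefault(dst, set()).update(comp(p) for p in programs[src])

    uf = UnionFind()
    for lang, ps in progs.items():
        for p in ps:
            uf.find((lang, p))  # register every program as a node
        for p, q in combinations(sorted(ps), 2):
            if same_within[lang](p, q):  # same language, essentially the same
                uf.union((lang, p), (lang, q))
    for (src, dst), comp in compilers.items():
        for p in programs[src]:  # different languages, linked by a compiler
            uf.union((src, p), (dst, comp(p)))

    classes = {}
    for node in list(uf.parent):
        classes.setdefault(uf.find(node), set()).add(node)
    return {frozenset(c) for c in classes.values()}


def infix_to_rpn(p):
    """Hypothetical compiler for programs of the form 'a + b'."""
    left, right = p.replace(" ", "").split("+")
    return f"{left} {right} +"


PROGRAMS = {"infix": ["1 + 2", "1+2", "2 + 1"], "rpn": ["1 2 +", "1  2 +"]}
SAME_WITHIN = {
    "infix": lambda p, q: p.replace(" ", "") == q.replace(" ", ""),
    "rpn": lambda p, q: p.split() == q.split(),
}
COMPILERS = {("infix", "rpn"): infix_to_rpn}
```

Calling `algorithms(PROGRAMS, SAME_WITHIN, COMPILERS)` groups the two spellings of `1 + 2` with their compiled form, while `2 + 1` lands in a separate class because the toy relations do not identify commuted operands. This illustrates how the resulting notion of algorithm depends on which relations are chosen.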